Power of the Spacing test for Least-Angle Regression
Recent advances in Post-Selection Inference have shown that conditional
testing is relevant and tractable in high dimensions. In the Gaussian linear
model, further works have derived unconditional test statistics such as the
Kac-Rice Pivot for general penalized problems. In order to test the global
null, a prominent offspring of this breakthrough is the spacing test that
accounts for the relative separation between the first two knots of the celebrated
least-angle regression (LARS) algorithm. However, no results have been shown
regarding the distribution of these test statistics under the alternative. For
the first time, this paper addresses this important issue for the spacing test
and shows that it is unconditionally unbiased. Furthermore, we provide the
first extension of the spacing test to the setting of unknown noise variance.
More precisely, we investigate the power of the spacing test for LARS and
prove that it is unbiased: its power is always greater than or equal to the
significance level. In particular, we describe the power of this test
under various scenarios: we prove that its rejection region is optimal when the
predictors are orthogonal; as the level goes to zero, we show that the
probability of getting a true positive is much greater than the level; and we
give a detailed description of its power in the case of two predictors.
Moreover, we numerically compare the spacing test for
LARS with Pearson's chi-squared test (goodness of fit).
Comment: 22 pages, 8 figures
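As a rough illustration of the unbiasedness claim in the abstract above, the following Python sketch simulates the spacing test in the simplest setting: an orthonormal design with known unit noise variance, where the first two LARS knots are the two largest absolute values of the inner products between predictors and response. The function names, the dimension p = 10, the signal strength, and the simulation sizes are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import random


def gaussian_sf(x):
    """Survival function 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def spacing_pvalue(z, sigma=1.0):
    """Spacing-test p-value from the first two LARS knots.

    Assumption: orthonormal design, so the knots are the two largest
    values of |z_j|, where z_j is the inner product of predictor j with y.
    """
    knots = sorted((abs(v) for v in z), reverse=True)
    lam1, lam2 = knots[0], knots[1]
    return gaussian_sf(lam1 / sigma) / gaussian_sf(lam2 / sigma)


def rejection_rate(mean, p=10, alpha=0.05, n_rep=20000, seed=0):
    """Monte Carlo estimate of P(reject) when one coordinate has the given mean."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        z[0] += mean  # hypothetical single active predictor
        if spacing_pvalue(z) <= alpha:
            rejections += 1
    return rejections / n_rep


if __name__ == "__main__":
    print("estimated level under the global null:", rejection_rate(0.0))
    print("estimated power with one active predictor:", rejection_rate(3.0))
```

Under the global null the estimated rejection rate should sit close to the nominal level, and under this particular alternative it should be larger, which is consistent with (but of course does not prove) the unbiasedness result stated in the abstract.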
Rice formulae and Gaussian waves
We use Rice formulae in order to compute the moments of some level
functionals which are linked to problems in oceanography and optics: the number
of specular points in one and two dimensions, the distribution of the normal
angle of level curves and the number of dislocations in random wavefronts. We
compute expectations and, in some cases, also second moments of such
functionals. Moments of order greater than one are more involved, but one needs
them whenever one wants to perform statistical inference on some parameters in
the model or to test the model itself. In some cases, we are able to use these
computations to obtain a central limit theorem.
Comment: Published at http://dx.doi.org/10.3150/10-BEJ265 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Remark on the finite-dimensional character of certain results of functional statistics
4 pages. This note shows that a certain assumption on small-ball probabilities, frequently used in functional statistics, implies that the functional space under consideration is finite-dimensional. To complete this result, an example is given of an L2 process that does not satisfy this assumption.
On the tails of the distribution of the maximum of a smooth stationary gaussian process
We study the tails of the distribution of the maximum of a stationary
Gaussian process on a bounded interval of the real line. Under regularity
conditions including the existence of the spectral moment of order 8,
we give an additional term in the asymptotic expansion of these tails. This widens the
application of an expansion given originally by Piterbarg [CITE] for
a sufficiently small interval.
CLT for Crossings of random trigonometric Polynomials
We establish a central limit theorem for the number of roots of the equation when is a Gaussian trigonometric polynomial of degree . The case was studied by Granville and Wigman. We show that for some size of the considered interval, the asymptotic behavior is different depending on whether vanishes or not. Our main tools are: a) a chaining argument with the stationary Gaussian process with covariance , b) the use of Wiener chaos decomposition that explains some singularities that appear in the limit when
Selective genotyping for QTL detection
New genomic technologies are proving effective at unlocking the secrets of the genetic variation of a quantitative trait. These technologies enable the molecular characterization of polymorphic markers (i.e. markers with several alleles) across the whole genome. These markers are then used to identify and locate the loci (i.e. precise physical positions on a chromosome) where allelic variation is associated with variation in the quantitative trait under study. Such loci are called QTL. Nevertheless, genotyping costs remain very high, which is why optimizing the experimental process is essential. One such experimental process is called selective genotyping. It was proposed by Lebowitz et al. (1987) and developed by Lander and Botstein (1989), Darvasi and Soller (1992), and Muranty and Goffinet (1997). Selective genotyping consists of genotyping only the individuals whose quantitative trait value is extreme (greater or smaller than a threshold). This reduces genotyping costs while retaining good power for the statistical test, provided that the number of individuals has been increased. This talk studies different strategies for statistical analysis under selective genotyping. The corresponding statistical tests are compared, in terms of efficiency, with the oracle test, in which all genotypes are known.